
Review for NeurIPS paper: Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks

Neural Information Processing Systems

I think the transfer distance can be interpreted as a measure of transferability, and the transfer distance defined in the paper suggests that transfer learning is possible only when W_S and W_T are close to each other under the \Sigma_T norm. I understand that this definition is motivated by Proposition 1, but it does not always reflect how transfer learning is applied in practice. In over-parametrized neural networks, two very different sets of weights can both yield well-performing models, yet some of the learned feature mappings can still be transferred to a variety of tasks. Thus, I believe the transfer distance defined here does not fully characterize transferability as it is generally discussed. Since the lower bound characterizes more than just the rate of convergence, I would like to see the phase-transition behavior of the bound across the different regimes; a discontinuity would suggest that the lower bound is not tight at those points.
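As a rough sketch of the notion the review refers to (an assumed form; the paper's exact definition and normalization may differ), a \Sigma_T-weighted transfer distance measures the gap between source and target parameters along directions weighted by the target input covariance:

% Sketch of a \Sigma_T-weighted transfer distance (assumed form; the
% paper's exact definition and scaling may differ).
\[
  d(W_S, W_T) \;=\; \bigl\| \Sigma_T^{1/2} \, (W_S - W_T) \bigr\\|_F ,
\]
% where \Sigma_T is the covariance matrix of the target-task inputs
% (Euclidean norm in the linear case, Frobenius norm for the hidden-layer
% weights), so parameter differences only count along directions that the
% target distribution weights heavily.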


Review for NeurIPS paper: Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks

Neural Information Processing Systems

This paper addresses the problem of inductive transfer with linear models and one-hidden-layer neural networks and establishes minimax lower bounds for these models. The three reviewers and the AC agree that this is a well-written paper that studies an important problem. The proposed fine-grained minimax rate for transfer learning is a nice contribution to the field. Although the setting is somewhat simple, this work is inspiring for the study of inductive transfer with neural networks. There remain some minor concerns about the organization of the paper and the evaluation of the proposed lower bound, which should be fully addressed in the camera-ready version.


Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks

Neural Information Processing Systems

Transfer learning has emerged as a powerful technique for improving the performance of machine learning models on new domains where labeled training data may be scarce. In this approach, a model trained on a source task, for which plenty of labeled training data is available, is used as a starting point for training a model on a related target task with only a few labeled training examples. Despite the recent empirical success of transfer learning approaches, the benefits and fundamental limits of transfer learning are poorly understood. In this paper we develop a statistical minimax framework to characterize the fundamental limits of transfer learning in the context of regression with linear and one-hidden layer neural network models. Specifically, we derive a lower bound on the target generalization error achievable by any algorithm, as a function of the number of labeled source and target samples as well as appropriate notions of similarity between the source and target tasks.
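Schematically (an illustrative form only, not the paper's exact statement), such a minimax lower bound controls the worst-case target risk of any estimator over all task pairs within a given transfer distance:

% Schematic minimax lower bound (illustrative; the paper's actual rate
% function, constants, and parameter classes differ).
\[
  \inf_{\widehat{W}} \;
  \sup_{(W_S,\, W_T)\,:\; d(W_S, W_T) \,\le\, \Delta}
  \mathbb{E}\!\left[ \mathcal{R}_T\bigl(\widehat{W}\bigr) \right]
  \;\ge\; c \,\phi(n_S,\, n_T,\, \Delta)
\]
% n_S, n_T : numbers of labeled source and target samples
% \Delta   : transfer distance between the source and target tasks
% \mathcal{R}_T : target generalization error
% \phi     : rate function, decreasing in n_S and n_T, increasing in \Delta
% The infimum ranges over all estimators that may use both datasets.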